AITopics | self-normalized concentration

Collaborating Authors

self-normalized concentration

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generalized Kernelized Bandits: Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds

Metelli, Alberto Maria, Drago, Simone, Mussi, Marco

arXiv.org Machine LearningAug-5-2025

We study the regret minimization problem in the novel setting of generalized kernelized bandits (GKBs), where we optimize an unknown function $f^*$ belonging to a reproducing kernel Hilbert space (RKHS) having access to samples generated by an exponential family (EF) noise model whose mean is a non-linear function $μ(f^*)$. This model extends both kernelized bandits (KBs) and generalized linear bandits (GLBs). We propose an optimistic algorithm, GKB-UCB, and we explain why existing self-normalized concentration inequalities do not allow to provide tight regret guarantees. For this reason, we devise a novel self-normalized Bernstein-like dimension-free inequality resorting to Freedman's inequality and a stitching argument, which represents a contribution of independent interest. Based on it, we conduct a regret analysis of GKB-UCB, deriving a regret bound of order $\widetilde{O}( γ_T \sqrt{T/κ_*})$, being $T$ the learning horizon, $γ_T$ the maximal information gain, and $κ_*$ a term characterizing the magnitude the reward nonlinearity. Our result matches, up to multiplicative constants and logarithmic terms, the state-of-the-art bounds for both KBs and GLBs and provides a unified view of both settings.

artificial intelligence, inequality, machine learning, (18 more...)

arXiv.org Machine Learning

2508.01681

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Reviews: Weighted Linear Bandits for Non-Stationary Environments

Neural Information Processing SystemsJan-22-2025, 09:44:41 GMT

Update (after reading the rebuttals): After reading the rebuttal of authors, I have addressed my concerns on the novelty of the new self-normalized concentration, since the key point is that the coefficient of regularizer is changing. I indeed appreciate this work. The idea of this paper is natural but there indeed exist technical challenges, and the authors address these issues elegantly. So I think it deserves an acceptance. Nevertheless, there are still many typos in current verison besides those listed before, for example, in Theorem 2, eq.

bandit, concentration, self-normalized concentration, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

Add feedback

On the Sublinear Regret of GP-UCB

Whitehouse, Justin, Wu, Zhiwei Steven, Ramdas, Aaditya

arXiv.org Artificial IntelligenceAug-14-2023

In the kernelized bandit problem, a learner aims to sequentially compute the optimum of a function lying in a reproducing kernel Hilbert space given only noisy evaluations at sequentially chosen points. In particular, the learner aims to minimize regret, which is a measure of the suboptimality of the choices made. Arguably the most popular algorithm is the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm, which involves acting based on a simple linear estimator of the unknown function. Despite its popularity, existing analyses of GP-UCB give a suboptimal regret rate, which fails to be sublinear for many commonly used kernels such as the Mat\'ern kernel. This has led to a longstanding open question: are existing regret analyses for GP-UCB tight, or can bounds be improved by using more sophisticated analytical techniques? In this work, we resolve this open question and show that GP-UCB enjoys nearly optimal regret. In particular, our results yield sublinear regret rates for the Mat\'ern kernel, improving over the state-of-the-art analyses and partially resolving a COLT open problem posed by Vakili et al. Our improvements rely on a key technical contribution -- regularizing kernel ridge estimators in proportion to the smoothness of the underlying kernel $k$. Applying this key idea together with a largely overlooked concentration result in separable Hilbert spaces (for which we provide an independent, simplified derivation), we are able to provide a tighter analysis of the GP-UCB algorithm.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2307.07539

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.49)

Add feedback